 relational learning


Dynamic Relational Priming Improves Transformer in Multivariate Time Series

arXiv.org Artificial Intelligence

Standard attention mechanisms in transformers employ static token representations that remain unchanged across all pair-wise computations in each layer. This limits their representational alignment with the potentially diverse relational dynamics of each token-pair interaction. While they excel in domains with relatively homogeneous relationships, standard attention's static relational learning struggles to capture the diverse, heterogeneous inter-channel dependencies of multivariate time series (MTS) data, where different channel-pair interactions within a single system may be governed by entirely different physical laws or temporal dynamics. To better align the attention mechanism with such domain phenomena, we propose attention with dynamic relational priming (prime attention). Unlike standard attention, where each token presents an identical representation across all of its pair-wise interactions, prime attention tailors each token dynamically (that is, per interaction) through learnable modulations to best capture the unique relational dynamics of each token pair, optimizing each pair-wise interaction for that specific relationship. This representational plasticity of prime attention enables effective extraction of relationship-specific information in MTS while maintaining the same asymptotic computational complexity as standard attention. Our results demonstrate that prime attention consistently outperforms standard attention across benchmarks, achieving up to 6.5% improvement in forecasting accuracy. In addition, we find that prime attention achieves comparable or superior performance with sequences up to 40% shorter than those used by standard attention, further demonstrating its superior relational modeling capabilities.
An important challenge in applying transformers to multivariate time series (MTS) stems from domain mismatch.
In language modeling, token relationships are predominantly semantic in nature, enabling most critical patterns to be captured by simple weighted sums of token representations. Similarly, in computer vision, spatial relationships dominate, enabling attention mechanisms to focus on regions of interest through uniform spatial reasoning. Learning on graphs exhibits comparable homogeneity, where node relationships are fundamentally structural and connectivity-based, allowing standard attention to model interactions through meaningful topological patterns (sometimes separated by relationship type; Schlichtkrull et al., 2018; Hu et al., 2020; Wang et al., 2019). By static, we mean that token representations in each layer are fixed relative to all other tokens throughout pair-wise modeling. We classify this property of standard attention mechanisms as static relational learning.
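
As a rough illustration of the contrast between static and per-interaction token representations, the sketch below compares plain scaled dot-product attention with a hypothetical pairwise-modulated variant. The abstract does not specify the form of the learnable modulations, so the elementwise `gate[i][j]` vectors, the function names, and the choice to modulate both keys and values are assumptions made purely for illustration, not the authors' method.

```python
import math

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, vectors)) for d in range(dim)]

def static_attention(q, k, v):
    """Standard attention: token j presents the same key/value to every query i."""
    scale = math.sqrt(len(q[0]))
    out = []
    for qi in q:
        weights = softmax([dot(qi, kj) / scale for kj in k])
        out.append(weighted_sum(weights, v))
    return out

def modulated_attention(q, k, v, gate):
    """Hypothetical per-pair variant (an assumption, not the paper's formula).

    gate[i][j] is an elementwise modulation vector applied to token j's key
    and value only for its interaction with query i. Keys and values are
    assumed to share one dimensionality so a single gate fits both.
    """
    scale = math.sqrt(len(q[0]))
    out = []
    for i, qi in enumerate(q):
        keys_i = [[g * x for g, x in zip(gate[i][j], kj)] for j, kj in enumerate(k)]
        vals_i = [[g * x for g, x in zip(gate[i][j], vj)] for j, vj in enumerate(v)]
        weights = softmax([dot(qi, kj) / scale for kj in keys_i])
        out.append(weighted_sum(weights, vals_i))
    return out
```

With every gate set to all ones, the modulated variant reduces to static attention. Both functions loop over the same token pairs and do a constant multiple of the same per-pair work in the feature dimension, which matches the abstract's claim about asymptotic compute in spirit only. In this sketch the gates are passed in as given rather than learned.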


Privately Learning from Graphs with Applications in Fine-tuning Large Language Models

arXiv.org Artificial Intelligence

Graphs offer unique insights into relationships and interactions between entities, complementing data modalities like text, images, and videos. By incorporating relational information from graph data, AI models can extend their capabilities beyond traditional tasks. However, relational data in sensitive domains such as finance and healthcare often contain private information, making privacy preservation crucial. Existing privacy-preserving methods, such as DP-SGD, which rely on gradient decoupling assumptions, are not well-suited for relational learning due to the inherent dependencies between coupled training samples. To address this challenge, we propose a privacy-preserving relational learning pipeline that decouples dependencies in sampled relations during training, ensuring differential privacy through a tailored application of DP-SGD. We apply this method to fine-tune large language models (LLMs) on sensitive graph data, and tackle the associated computational complexities. Our approach is evaluated on LLMs of varying sizes (e.g., BERT, Llama2) using real-world relational data from four text-attributed graphs. The results demonstrate significant improvements in relational learning tasks, all while maintaining robust privacy guarantees during training. Additionally, we explore the trade-offs between privacy, utility, and computational efficiency, offering insights into the practical deployment of our approach. Code is available at https://github.com/Graph-COM/PvGaLM.
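
For readers unfamiliar with DP-SGD, the following minimal sketch shows the standard gradient step that the abstract refers to: clip each per-sample gradient, sum, add Gaussian noise, and average. It is not the paper's relational pipeline, which is described as a tailored application of DP-SGD; the function names and parameters (`max_norm`, `noise_multiplier`) are illustrative choices.

```python
import math
import random

def clip(grad, max_norm):
    """Scale a gradient down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]

def dp_sgd_gradient(per_sample_grads, max_norm, noise_multiplier, rng):
    """Return a privatized average gradient in the style of standard DP-SGD.

    Clipping bounds each sample's influence on the sum. That bound assumes
    one gradient per independent sample, which is the assumption the
    abstract says breaks down when training samples are coupled by relations.
    """
    clipped = [clip(g, max_norm) for g in per_sample_grads]
    dim = len(clipped[0])
    summed = [sum(g[d] for g in clipped) for d in range(dim)]
    sigma = noise_multiplier * max_norm  # noise scale tied to the clip bound
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [x / len(clipped) for x in noisy]

# Example call with a seeded generator so runs are repeatable.
rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]
update = dp_sgd_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng)
```

Setting `noise_multiplier` to zero recovers the plain average of the clipped gradients, which is a convenient sanity check but provides no privacy.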


Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective

arXiv.org Artificial Intelligence

Foundation Models (FMs) have demonstrated remarkable insights into the relational dynamics of the world, leading to the crucial question: how do these models acquire an understanding of world hybrid relations? Traditional statistical learning, particularly for prediction problems, may overlook the rich and inherently structured information from the data, especially regarding the relationships between objects. We introduce a mathematical model that formalizes relational learning as hypergraph recovery to study pre-training of FMs. In our framework, the world is represented as a hypergraph, with data abstracted as random samples from hyperedges. We theoretically examine the feasibility of a Pre-Trained Model (PTM) to recover this hypergraph and analyze the data efficiency in a minimax near-optimal style. By integrating rich graph theories into the realm of PTMs, our mathematical framework offers powerful tools for an in-depth understanding of pre-training from a unique perspective and can be used under various scenarios. As an example, we extend the framework to entity alignment in multimodal learning.


Statistical relational learning and neuro-symbolic AI: what does first-order logic offer?

arXiv.org Artificial Intelligence

In this paper, our aim is to briefly survey and articulate the logical and philosophical foundations of using (first-order) logic to represent (probabilistic) knowledge in a non-technical fashion. Our motivation is threefold. First, for machine learning researchers unaware of why the research community cares about relational representations, this article can serve as a gentle introduction. Second, for logical experts who are newcomers to the learning area, such an article can help in navigating the differences between finite vs infinite, and subjective probabilities vs random-world semantics. Finally, for researchers from statistical relational learning and neuro-symbolic AI, who are usually embedded in finite worlds with subjective probabilities, appreciating what infinite domains and random-world semantics bring to the table is of utmost theoretical import.


Relational Learning with Gaussian Processes

Neural Information Processing Systems

Correlation between instances is often modelled via a kernel function using input attributes of the instances. Relational knowledge can further reveal additional pairwise correlations between variables of interest. In this paper, we develop a class of models which incorporates both reciprocal relational information and input attributes using Gaussian process techniques. This approach provides a novel non-parametric Bayesian framework with a data-dependent covariance function for supervised learning tasks. We also apply this framework to semi-supervised learning.


Hidden Common Cause Relations in Relational Learning

Neural Information Processing Systems

When predicting class labels for objects within a relational database, it is often helpful to consider a model for relationships: this allows for information between class labels to be shared and to improve prediction performance. However, there are different ways by which objects can be related within a relational database. One traditional way corresponds to a Markov network structure: each existing relation is represented by an undirected edge. This encodes that, conditioned on input features, each object label is independent of other object labels given its neighbors in the graph. However, there is no reason why Markov networks should be the only representation of choice for symmetric dependence structures.


Hierarchical Relational Learning for Few-Shot Knowledge Graph Completion

arXiv.org Artificial Intelligence

Knowledge graphs (KGs) are known for their large scale and knowledge inference ability, but are also notorious for the incompleteness associated with them. Due to the long-tail distribution of the relations in KGs, few-shot KG completion has been proposed as a solution to alleviate incompleteness and expand the coverage of KGs. It aims to make predictions for triplets involving novel relations when only a few training triplets are provided as reference. Previous methods have mostly focused on designing local neighbor aggregators to learn entity-level information and/or imposing sequential dependency assumption at the triplet level to learn meta relation information. However, valuable pairwise triplet-level interactions and context-level relational information have been largely overlooked for learning meta representations of few-shot relations. In this paper, we propose a hierarchical relational learning method (HiRe) for few-shot KG completion. By jointly capturing three levels of relational information (entity-level, triplet-level and context-level), HiRe can effectively learn and refine the meta representation of few-shot relations, and consequently generalize very well to new unseen relations. Extensive experiments on two benchmark datasets validate the superiority of HiRe against other state-of-the-art methods.


Overview of DBAI@NeurIPS'21

#artificialintelligence

After two decades of in-RDBMS machine learning research and implementations, database systems have not made a compelling case for data scientists to move their workflows there. A transition phase is currently under way, where the database community with all the experience of the past is looking for crucial features, such as data versioning and data governance, that would make DBMSes attractive to data scientists, and where the definition of in-RDBMS machine learning becomes less rigid with the adoption of data lakes and the interoperability with systems like TensorFlow and open formats like ONNX. Overall, we are very happy with the content of the 1st DBAI, as this included insightful presentations and a constructive panel discussion. I'd like to sincerely thank my fellow organizers (Nikolaos Vasilogou, Parisa Kordjamshidi, Maximilian Schleich, Kirk Pruhs and Zenna Tavares), the PC members, the speakers and panelists, the sponsors, the volunteers and last but not least the authors and attendees for contributing each in his/her own way in making DBAI'21 a successful workshop. I really hope we will have the opportunity to organize another DBAI soon.


From Graph ML to Deep Relational Learning

#artificialintelligence

Graph-structured data are all around us. With the recent advent of deep learning, it seems only natural that researchers started to explore this data representation with neural networks, too. Currently, we are experiencing an explosion of the Graph Neural Network (GNN) class, with countless models being proposed under a variety of (catchy) names. Nevertheless, most of these models are based on the same simple graph propagation principle. To look at the problem from a broader view, we will here reveal the underlying GNN principles from the general perspective of Relational Machine Learning, which we discussed in a previous article.
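
To make the "simple graph propagation principle" concrete, here is a minimal sketch of one propagation round, assuming mean aggregation over each node's neighbors plus the node itself. The teaser does not name a specific aggregation scheme, so the mean, the self-loop, and the absence of learned weights are illustrative simplifications.

```python
def propagate(features, edges):
    """One round of mean-aggregation message passing on an undirected graph.

    features: dict mapping node -> list of floats
    edges: iterable of (node_a, node_b) pairs
    """
    neighbors = {node: [] for node in features}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    updated = {}
    for node, feat in features.items():
        # Collect messages from neighbors and include the node's own features.
        messages = [features[n] for n in neighbors[node]] + [feat]
        updated[node] = [sum(col) / len(messages) for col in zip(*messages)]
    return updated

# A three-node path graph a - b - c with one feature per node.
features = {"a": [1.0], "b": [3.0], "c": [5.0]}
edges = [("a", "b"), ("b", "c")]
smoothed = propagate(features, edges)
```

Many GNN layers can be read as variations on this loop: they swap the mean for another aggregator and pass the result through a learned transformation and nonlinearity.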